Abstract

We present a framework for the modeling of multipath routing
in connectionless networks that dynamically adapt to network
congestion. The basic routing protocol uses a short-term metric
based on hop-by-hop credits to reduce congestion over a given
link, and a long-term metric based on end-to-end path delay to
reduce delays from a source to a given destination. A worst-case
bound on the end-to-end path delay is derived under three architectural
assumptions: each router adopts a weighted fair queueing
(or packetized generalized processor sharing) service discipline
on a per-destination basis, a permit-bucket filter is used at each
router to regulate traffic flow on a per-destination basis, and all
paths are loop free. The shortest multipath routing protocol regulates
the parameters of the destination-oriented permit buckets
and guarantees that all portions of a multipath are loop free.

1.  Introduction 

Efficient routing results in smaller average packet delays,
which means that the flow control algorithm can accept more
traffic into the network. On the other hand, an efficient flow control
algorithm rejects excessive offered load that would necessarily
increase packet delays by saturating network resources. It is clear
that routing and congestion control are very much interrelated.

A drawback of existing internet routing protocols is that their
main routing mechanisms (route computation and packet forwarding)
are poorly integrated with congestion control mechanisms.
More specifically, today’s internet routing is based on single-path
routing algorithms; even in theory, a routing protocol based on
single-path routing is ill suited to cope with congestion, because
the only thing the protocol can do to react to congestion is to change
the route used to reach a destination. However, as has been
documented in [1], allowing a single-path routing algorithm to
react to congestion can lead to unstable oscillatory behavior. Furthermore,
for connectionless service, any datagram offered to the
network is accepted; although routers forward packets only on a
best-effort basis and drop them when congestion occurs, the steps
taken by routers occur after the packets have been allowed to congest
the network, and it is up to the transport protocol to react to
congestion after network resources are already being wasted.

The work reported in this paper was motivated by our conjecture
that architectural elements similar to those used in a
connection-oriented architecture to allow the network to enforce
performance guarantees could be used to integrate routing with
congestion control, and to provide some delay guarantees for the
delivery of those datagrams that are accepted in the network. We
propose a new framework and protocol for dynamic multipath
routing in packet-switched networks that attempts to prevent
overutilization of network resources and hence congestion. Packets
are individually routed towards their destinations on a hop-by-hop
basis. A packet intended for a given destination is allowed to enter
the network if and only if there is at least one path of routers with
enough resources to ensure its delivery within a finite time. In
contrast to existing connectionless routing schemes, once a packet
is accepted into the network, it is delivered to its destination, unless
resource failures prevent it. Each router reserves buffer space
for each destination, rather than for each source-destination session
as is customary in a connection-oriented architecture, and
forwards a received packet along one of multiple loop-free paths
towards the destination. The buffer space and available paths for
each destination are updated to adapt to congestion and topology
changes.

Our framework is based on three main architectural elements:
(a) traffic shaping by means of destination-oriented permit buckets;
(b) traffic separation and scheduling on a per-destination basis;
and (c) the dynamic maintenance of multiple loop-free paths that
always attempt to reduce the delay from source to destination.
Permit buckets consist of permits or tokens fed by periodic updates
of credits. To schedule packet transmission, we assume a
packet-by-packet generalized processor sharing (PGPS) server [9]
at each node. To establish loop-free multipaths, we extend prior
results on loop-free single-path routing algorithms introduced in
[5]. This results in a congestion-oriented multipath routing architecture
that uses a short-term metric based on hop-by-hop credits
to reduce congestion over a given link, and a long-term metric
based on end-to-end path delay to reduce delay from source to
destination. The main contribution of this work is to illustrate the
provision of performance guarantees in a connectionless routing
architecture.

Section 2 describes the network model used in our protocol.
Section 3 gives a detailed description of the new routing protocol.
Section 4 derives worst-case steady-state delay bounds for packets
accepted into the network by extending the analysis described
in [10]. Section 5 presents our conclusions.

2.  Network  Model 

A  computer  network  is  modeled  as  an  undirected  finite  graph 
represented  by  G(N,  E),  where  N  is  the  set  of  nodes  and  E 
is  the  set  of  edges  or  links  connecting  the  nodes.  A  functional 
bidirectional  link  connecting  nodes  i  and  j  is  represented  as  (i,  j) 
and  is  assigned  a  positive  weight  in  each  direction.  A  link  is 
assumed  to  exist  in  both  directions  at  the  same  time.  All  routing 
messages that are received (transmitted) by a node are put in the
input (output) queue on an FCFS basis and are processed in that
order.  Each  node  is  represented  by  a  unique  identifier  and  the 
link  costs  can  vary  in  time  but  are  always  positive.  The  distance 
between  any  two  given  nodes  is  measured  as  the  sum  of  the  link 
costs  of  the  path  between  the  nodes. 

A path from node i to node j is a sequence of nodes,
$i, n_1, \ldots, n_r, j$, where $(i, n_1)$, $(n_x, n_{x+1})$, and $(n_r, j)$ are links in
the path. A simple path from i to j is a sequence of nodes in
which no node is visited more than once. A multipath from i to j
is a set of simple paths from i to j. The paths between any pair
of nodes and their corresponding distances change over time in
a dynamic network. At any point in time, node i is connected
to node j if a physical path exists from i to j at that time. The
network is said to be connected if every pair of operational nodes
is connected at a given time.
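The definitions above can be illustrated with a minimal sketch. The helper names (`path_distance`, `is_simple`) and the example link costs are hypothetical and chosen only for illustration; the sketch assumes that link costs are stored per direction and are positive, as stated in the model.

```python
# Illustrative sketch of the network model; names and costs are hypothetical.

def path_distance(costs, path):
    """Distance of a path: the sum of the directed link costs along it."""
    return sum(costs[(u, v)] for u, v in zip(path, path[1:]))

def is_simple(path):
    """A simple path visits no node more than once."""
    return len(set(path)) == len(path)

# Each bidirectional link carries one positive cost per direction.
costs = {
    ("i", "a"): 2.0, ("a", "i"): 3.0,
    ("a", "j"): 1.5, ("j", "a"): 1.5,
}
```

For example, `path_distance(costs, ["i", "a", "j"])` adds the costs of links (i, a) and (a, j), while `is_simple(["i", "a", "i"])` is false because node i is visited twice.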

3.  Protocol  Description 

The new protocol can be divided into three functional elements,
namely: packet scheduling and transmission, a congestion-based
credit mechanism, and maintenance of multiple loop-free paths.

Scheduling at a node is done by maintaining permit bucket
filters at each node for all active destinations. A weighted fair
queueing mechanism is used for fairness [4]. Routing is done on
a hop-by-hop basis independently at each node.

To  forward  packets  to  a  given  destination,  the  protocol  uses 
two  routing  metrics:  a  short-term  metric  based  on  hop-by-hop 
credits  to  reduce  congestion  along  a  link,  and  a  long-term  metric 
based  on  path-delay  to  minimize  end-to-end  delay. 

The routing variables associated with each link are determined
by periodically monitoring traffic on the incoming and the outgoing
links at each node through each neighbor. Given the capacity
of each link, the utilization of the link can also be determined.
Credits are reassigned to upstream neighbors depending on the
traffic flow on each of the incoming links. A multipath routing
algorithm based on DUAL [5] maintains multiple loop-free paths.
Each time the network state changes, paths are recomputed and
the new network state is obtained. This is made possible by the
periodic exchange of routing information.

Each node maintains a routing table, a distance table, a link
cost table and a link credit table. The distance table at node i is a
matrix that contains, for each destination j and for each neighbor
k, routing (cost and credit) information along with the distance
reported to node i by node k regarding destination j, and a
successor flag ($flag^i_{jk}$) indicating whether neighbor k belongs to
the shortest multipath set for destination j. Node i’s routing table
is a column vector containing the routing information about the
shortest path to all destinations; it maintains information about the
distance ($D^i_j$), successor ($s^i_j$), and the routing parameters (credits
and delay). The neighbor nodes used for packet forwarding from
node i to node j are said to belong to the shortest multipath from
i to j, denoted by $SM^i_j$. If the neighbor node belongs to $SM^i_j$,
then $flag^i_{jk}$ is set to 1; otherwise it is set to 0. The link cost table
maintains the distance information about all the neighboring links
and the link-credit table maintains information about the credits
available through all the neighboring links for each destination.
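As a rough illustration of the distance table just described, the following sketch stores one entry per (destination, neighbor) pair and derives the shortest multipath set from the successor flags. The class and function names, and the example values, are assumptions made for this sketch and are not part of the protocol specification.

```python
from dataclasses import dataclass

@dataclass
class DistanceEntry:
    """One cell of node i's distance table, for destination j and neighbor k."""
    distance: float   # distance to j reported by neighbor k
    credits: int      # credits available to j through k
    flag: int = 0     # 1 if k belongs to the shortest multipath set, else 0

def shortest_multipath_set(table, dest):
    """Neighbors whose successor flag is set for destination `dest`."""
    return sorted(k for (j, k), e in table.items() if j == dest and e.flag == 1)

# Hypothetical table at node i, keyed by (destination, neighbor).
table = {
    ("j", "a"): DistanceEntry(distance=4.0, credits=5, flag=1),
    ("j", "b"): DistanceEntry(distance=6.0, credits=0, flag=0),
    ("j", "c"): DistanceEntry(distance=4.0, credits=2, flag=1),
}
```

Here `shortest_multipath_set(table, "j")` collects neighbors a and c, the two entries whose flag is 1.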


3.1  Packet  Scheduling  and  Transmission  Scheme 

Packet scheduling is done by means of permit bucket filters
for each destination. The packet-by-packet generalized processor
sharing (PGPS) scheme is used at each server [9]. Packets are
transmitted as individual entities. A packet is said to have arrived
only after the last bit has been received at a node. The server picks
up the first packet that would complete service if no additional
packets were to arrive. Routing is done on a per-destination basis
over multiple paths.

Note that, because all the nodes along any path from a source
to a given destination can contribute to the flow to that destination,
each node is modeled as a PGPS server to regulate the incoming
traffic, instead of just having a simple scheduling discipline at the
intermediate nodes as can be assumed in a connection-oriented
architecture [10]. This scheme, along with the credit-based congestion
control mechanism, ensures that the bursty nature of sources
does not affect the routing architecture.

The  protocol  uses  two  routing  metrics  for  transmitting  packets 
to  a  given  destination:  a  short-term  metric  based  on  hop-by-hop 
credits  to  reduce  congestion  along  a  link,  and  a  long-term  metric 
based  on  path-delay  to  minimize  end-to-end  delay  along  the  paths. 
The  number  of  packets  sent  to  a  neighbor  depends  on  the  credits 
available  through  that  neighbor.  Credits  for  a  destination  are  sent 
from  a  destination  towards  the  source  along  the  reverse  paths 
implied  by  the  routing  tables.  When  a  node  becomes  operational, 
depending  on  the  availability  of  resources  at  each  node,  credits 
are  distributed  among  its  neighboring  nodes. 

The traffic at each node is regulated by permit buckets, independently
for each destination. In the traditional leaky bucket
congestion control scheme, buckets are session oriented: data
packets are accepted from the transmitter, and the average rate of flow
is controlled by a burst rate for a source-destination session. In
our scheme, permit buckets (which are similar to leaky buckets)
are destination oriented. For a given destination j, credits arrive
at a given node i at a rate $\rho^i_j$, which is called the token generation
rate for destination j at node i. The bucket size, denoted by $\sigma^i_j(t)$,
gives the maximum number of packets that can be transmitted
from i to j at time t. This determines the burstiness of traffic, and
is defined for each destination j at time $t > 0$ as

$\sigma^i_j(t) = l^i_j(t) + Q^i_j(t)$    (1)

where $l^i_j(t)$ is the number of left-over credits (or tokens) in the
bucket at node i for destination j at time t, and $Q^i_j(t)$ is the
backlog for destination j at time t. This definition is much the
same as that given in [9], the only difference being that here we maintain
leaky-bucket parameters for each active destination rather than for
each session.
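The bookkeeping behind Eq. 1 can be sketched as follows. This is only an illustration under simplifying assumptions that are not part of the definition above: one credit is consumed per packet, and the class and method names (`PermitBucket`, `add_credits`, `enqueue`, `transmit`) are hypothetical.

```python
from collections import deque

class PermitBucket:
    """Per-destination permit bucket at one node (illustrative sketch)."""

    def __init__(self):
        self.tokens = 0          # left-over credits l(t)
        self.backlog = deque()   # backlogged packets, so Q(t) = len(backlog)

    def add_credits(self, n):
        """Credits received for this destination refill the bucket."""
        self.tokens += n

    def enqueue(self, packet):
        """A packet for this destination joins the backlog."""
        self.backlog.append(packet)

    def bucket_size(self):
        """Eq. 1: sigma(t) = l(t) + Q(t)."""
        return self.tokens + len(self.backlog)

    def transmit(self):
        """Send backlogged packets while credits remain (one credit each)."""
        sent = []
        while self.tokens > 0 and self.backlog:
            sent.append(self.backlog.popleft())
            self.tokens -= 1
        return sent
```

A node would keep one such bucket per active destination rather than one per session, which is the only structural difference from the session-oriented leaky bucket.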

Destination-based credits are aggregated at each node. Each
hop is considered as a source; credits sent by the downstream
nodes are aggregated at each hop for a given destination and are
redistributed among its upstream neighbors. The total available
credits at each node for a given destination is the sum of the credits
received from its downstream neighbors for that destination. In
Figure 1, if z is the number of credits received by node a from
its downstream neighbors to destination j, then node a maintains
a permit bucket of size z. These credits are redistributed among
its upstream neighbors i and s, depending on the traffic flow along
links (i, a) and (s, a), as x and y.

Fig. 2. Multipath Tree

The number of credits left behind, denoted by $l^i_j$, is the difference
between the number of arrivals and the number of credits that
arrive within a given time interval. Accordingly, $l^i_j(\tau, t) =
A^i_j(\tau, t) - [K^i_j(t) - K^i_j(\tau)]$, where $K^i_j(t)$ is the number of credits
that have arrived at node i for destination j up to time t, and $A^i_j(\tau, t)$ is the
traffic arriving at node i for destination j in the interval $(\tau, t]$. The
total number of accepted credits in a time period should be less
than the amount allowed by the credit generation rate. Therefore, with $\tau < t$,

$K^i_j(t) - K^i_j(\tau) < \rho^i_j (t - \tau)$    (2)

and

$\sigma^i_j(\tau, t) > A^i_j(\tau, t) - \rho^i_j (t - \tau) + Q^i_j(\tau, t)$    (3)

For the time interval $(t - \Delta t, t)$ we can write Eq. 3 as:

$\sigma^i_j(t) > A^i_j(t) - \rho^i_j(t) + Q^i_j(t)$    (4)

The token generation rate $\rho^i_j(t)$ is related to the total number of
available credits at time t and is the sum of the credits available
through all the nodes downstream of node i. In Figure 2, the total number of
credits available at router i for destination j is the sum of the credits
available from its downstream neighbors a, b, and c for destination
j. The total number of packets transmitted to destination j from
node i cannot exceed the total available credits at i for j at time t.
The number of available credits also depends on the traffic flow
on that link, which is a measure of the congestion level of that link.
3.2  Credit-Based  Congestion  Mechanism 

Congestion  over  a  given  link  is  controlled  by  a  hop-by-hop 
credit-based  mechanism.  Each  node  selects  a  path  to  a  destination 
based  on  the  bandwidth  available  through  a  given  link,  utilization 
of  that  link,  and  the  distance  to  the  destination.  The  chosen  path 
is  subjected  to  a  constraint  that  the  bandwidth  available  is  at  least 
equal  to  the  required  bandwidth,  and  the  total  bandwidth  allocated 
through  a  link  is  less  than  the  capacity  of  that  link.  The  available 
bandwidth  is  then  translated  into  credits.  A  credit  given  by  a  node 
to  its  upstream  neighbors  for  a  given  destination  represents  the 
number  of  packets  that  a  node  can  accept  from  its  neighbors  for  a 
destination.  Credits  are  sent  to  upstream  nodes  along  the  specified 
reverse  direction  of  the  routing  table. 

Figure  3  presents  a  formal  description  of  the  allocation  scheme. 
Procedure  Initialize  indicates  the  action  taken  by  a  node  when  it 
becomes  active  for  the  first  time.  Procedure  Receive  describes  the 
functions  performed  by  a  node  when  a  node  receives  a  periodic 
update. 

3.2.1  Initialization 

The number of credits available at each node is determined by
the buffer space available at that node ($MaxBufs_j$). A part of the
total available credits (Reserve) is reserved for a fast reservation
mechanism. A fast reservation mechanism is used to allocate
credits to a new node when it becomes a part of the loop-free path
to a given destination. This mechanism sends a minimum number
of credits to the new shortest multipath neighbor as explained
below. This speeds up the credit allocation process, thus avoiding
slow start. The remaining credits, $CR^i_j$, are equally distributed
among the neighboring nodes, $N_i$, on startup. The share sent to
neighbor k is termed the weighted credit $WCR^i_{jk}$. The delay incurred for the credits
to reach neighbors is incorporated while computing the available
credits.
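The startup allocation described above can be sketched in a few lines. The function name and the use of integer division are assumptions of this sketch (the text does not say how a remainder is handled); the credit propagation delay is ignored here.

```python
def initial_weighted_credits(max_bufs, reserve, neighbors):
    """Equal startup split of the non-reserved credits among the neighbors.

    max_bufs:  credits implied by the buffer space at the node
    reserve:   credits held back for the fast reservation mechanism
    neighbors: identifiers of the neighboring nodes
    Integer division is an assumption; any remainder simply stays unallocated.
    """
    if not neighbors:
        return {}
    remaining = max_bufs - reserve
    share = remaining // len(neighbors)
    return {k: share for k in neighbors}
```

With 100 buffer credits, a reserve of 10 and three neighbors, each neighbor would be offered 30 credits under these assumptions.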

On initialization, credits are equally distributed among neighbors
since there will not be any traffic on any of the links of a
newly established node. Credits are dynamically assigned thereafter
among all the active flows, depending on the traffic flow
through each of the links. When an operational node i recognizes
that a new neighbor has become operational, it sends a fixed minimum
number of credits ($CR_{min}$) to its new neighbor indicating
that i can be a possible multipath successor. When node i selects
neighbor k as one of its multiple successors to a destination, it sets
the successor flag in the update message ($flag^i_{jk}$) to that neighbor
to indicate that the neighbor now belongs to $SM^i_j$. When node k
recognizes this, it includes the node in its set of active neighbors
and sends minimum credits to the neighbor and redistributes its
credits for that destination. This information is communicated to
other nodes in the next update interval. The total credits sent to the
upstream neighbor is limited by the total available credits at that
node. Credit information at each node is updated periodically.

Each routing node resets its traffic counters and monitors the
incoming and outgoing traffic for all its neighbors. Based on these
statistics, the routing parameters are computed. The permit
bucket parameters are also initialized for each destination. The
token generation rate $\rho^i_j$ is initialized to the sum of the credits
available through all the neighbors of i to a given destination j
in the given time period. The bucket size $\sigma^i_j$ is initialized to the
number of leftover credits, since on initialization there will not be
any backlog.

3.2.2  Steady  State 

A periodic update timer is maintained at each router to exchange
credit information periodically. Each router monitors its
traffic on its incoming and outgoing links every $\Delta t$ seconds (update
interval). The update interval $\Delta t$ should be longer than the
maximum round trip time (RTT) delay between two nodes in the
network. Each time an update is sent, the timer, $timer^i_j$, is reset
(Figure 3).

At  each  node,  credits  received  from  all  downstream  nodes  are 
aggregated  and  are  redistributed  to  the  upstream  neighbors.  This 
can  be  done  because  the  total  bandwidth  allocated  at  each  link  at 
any  given  time  is  no  more  than  the  capacity  of  that  link.  A  node  can 
send  data  packets  to  a  downstream  neighbor  only  if  the  credit  value 
through  that  neighbor  is  greater  than  zero.  Also,  because  at  each 
hop  credits  are  distributed  based  on  the  traffic  flow,  the  algorithm 
ensures  that  information  about  active  destinations  is  maintained, 
i.e.,  those  for  which  data  traffic  needs  to  flow  from  or  through  the 
node.  When  a  new  destination  for  which  the  bandwidth  is  not 
reserved  becomes  active,  or  when  a  node  becomes  a  part  of  the  set 
of  loop-free  paths  to  a  destination,  credits  are  redistributed  using 
a  fast  reservation  mechanism. 
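A minimal sketch of the steady-state step is given below: credits received from downstream neighbors are aggregated, a fraction is held back, and the rest is split among upstream neighbors in proportion to their measured traffic. The function names and the `reserve_fraction` parameter are hypothetical, and falling back to an equal split when no traffic has been observed is an assumption borrowed from the initialization phase.

```python
def redistribute_credits(downstream_credits, upstream_flow, reserve_fraction=0.1):
    """Aggregate downstream credits and split them among upstream neighbors.

    downstream_credits: credits received per downstream neighbor
    upstream_flow:      measured traffic per upstream neighbor
    reserve_fraction:   hypothetical share kept for fast reservation
    """
    total = sum(downstream_credits.values())
    usable = total * (1.0 - reserve_fraction)
    if not upstream_flow:
        return {}
    flow_sum = sum(upstream_flow.values())
    if flow_sum == 0:
        # No traffic observed yet: fall back to an equal split (assumption).
        share = usable / len(upstream_flow)
        return {k: share for k in upstream_flow}
    return {k: usable * f / flow_sum for k, f in upstream_flow.items()}

def can_forward(credits_through_neighbor):
    """A node may send to a downstream neighbor only if its credit is positive."""
    return credits_through_neighbor > 0
```

For instance, with 10 aggregated credits, a reserve fraction of 0.2, and upstream flows in a 3:1 ratio, the two upstream neighbors would receive roughly 6 and 2 credits.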

Each node monitors the traffic flowing through it periodically
and determines the traffic flow on each of its links for all destinations.
It also computes the end-to-end delay associated with


Variables:

   $p^i_{jl}$: credits occupied by packets in transit
   $q^i_j$: credits due to packets already in queue

Procedure Initialize
when router i initializes itself
begin
   ...
end

Procedure Receive(k)
when a periodic update is received
begin
   ...
   Send credit information to all x ∈ ... at next
   update interval
end

Fig. 3. Credit Distribution Mechanism


packets to each destination. If the measured delay does not satisfy
the required QoS, that successor will no longer be selected
as a feasible successor to that destination and this information is
communicated to all the neighboring nodes. It then determines
the total available credits for a given destination j and the credits
are redistributed among its upstream neighbors after reserving a
fraction of the credits for the initialization phase. The philosophy
behind this mechanism is similar to a fast bandwidth reservation
scheme in which data transmission begins before a connection
has been completely established.

Figure  4  shows  the  distance  table  at  node  i  for  destination  j  for 
the configuration in Figure 2. The flag field indicates whether the
neighbor  belongs  to  the  shortest  multipath  set  or  not.  The  distance 
gives  the  sum  of  the  link  costs  along  the  path  to  destination  j  and 
credits  gives  the  number  of  available  credits  through  that  path.  A 
credit  of  0  implies  that  packets  cannot  be  forwarded  through  that 
path. 

Fig. 4. Distance Table at node i for destination j

The number of credits available at a node is determined by the
flow on its links and the total traffic seen by a node. If $f^k_{ji}$ is the
incoming flow on link (k, i) to destination j as seen by node i,
and $r^i_j$ is the traffic originated at i for destination j, we define the
total input traffic seen by i for destination j as the sum of all the
incoming traffic at node i, and denote it by $f^i_j$. Furthermore, by
the conservation of flow, the sum of all the traffic arriving at a
node must be equal to the sum of all the traffic departing from a
node for each destination j. Therefore, for destination j, the total
incoming flow is equal to the total outgoing flow at node i, and

$f^i_j = r^i_j + \sum_{k \in N_i} f^k_{ji} = \sum_{k \in N_i} f^i_{jk}$    (5)

For convenience, a routing variable, denoted by $\phi^i_{jk}$, is defined
for each link (i, k) as the ratio of the flow on each link with respect
to the total flow on all outgoing links for a given destination j.
From Eq. 5 and with $N_i$ denoting the neighbor set of i we have:

$\phi^i_{jk} = \frac{f^i_{jk}}{f^i_j}, \quad \forall k \in N_i$    (6)

Because node i itself can also contribute to the total traffic, by
the conservation of flow, it must be true that

$\sum_{k \in SM^i_j} \phi^i_{jk} = 1$    (7)

The distribution of credits to upstream neighbors depends on
the traffic flow on that link, which in turn depends on the routing
variable $\phi^i_{jk}$ associated with that link. The number of credits a
node sends to an upstream neighbor is called the weighted credit
(WCR). Credits are weighted by the traffic flow on a given link.
The token generation rate for a given update period $\Delta t$ can now
be defined as

$\rho^i_j(t) = \frac{\text{Credits available in update period before } t}{\Delta t} = \frac{\sum_{k \in SM^i_j} WCR^i_{jk}(t)}{\Delta t}$    (8)

To obtain a correct estimate of the credits available at each
node at any given time, we need to take into account the delay
associated with the propagation of credits. This can be done
either by estimating the credits available as in [6] or by explicitly
sending a marker. We opt for the estimation mechanism. Here,
credits are sent to the immediate upstream neighbor, i.e., they
propagate only one hop. The update period used for updating
routing information is considered as one round-trip delay by a data
packet. Therefore, to obtain a correct estimate of the available
credits at a node, we have to take into account the data packets
that the sender has already forwarded over the link for the past
round-trip time (RTT) and the data packets that are already queued
from the past RTT. Therefore, the total available credits at node
i for a destination j, denoted by $CR^i_j$, is the difference between
the sum of all the weighted credits available from its downstream
neighbors l (equivalently, sum of all the credits on its outgoing
links) belonging to the shortest multipath and the credits which
are already being used, i.e.,

$CR^i_j(t) = \sum_{l \in SM^i_j} [WCR^i_{jl}(t) - p^i_{jl}(t)] - q^i_j(t)$    (9)

where $WCR^i_{jl}$ is the weighted credit obtained from the downstream
neighbor l over an update period (which depends on the
flow on the link (i, l)), $p^i_{jl}$ is the number of credits occupied by the
packets that are already in transit on link (i, l), and $q^i_j$ is the number
of credits due to the data packets that are already in the queue
at node i for destination j which were not completely transmitted
since the previous update period. If a node does not have credits
for a given destination j, then $CR^i_j$ is set to zero.
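The three quantities discussed in this subsection (the routing variable as a flow ratio, the token generation rate as weighted credits per update period, and the available credits as weighted credits minus credits already in use) can be sketched as follows. All function and parameter names are hypothetical, and clamping a negative result to zero is an assumption made here to match the statement that the available credits are set to zero when a node has none.

```python
def routing_variables(outgoing_flow):
    """Fraction of the total outgoing flow carried by each link (i, k)."""
    total = sum(outgoing_flow.values())
    if total == 0:
        return {k: 0.0 for k in outgoing_flow}
    return {k: f / total for k, f in outgoing_flow.items()}

def token_rate(weighted_credits, update_period):
    """Credits made available in one update period, per unit time."""
    return sum(weighted_credits.values()) / update_period

def available_credits(weighted_credits, in_transit, queued):
    """Weighted credits minus credits in use (in transit and queued).

    weighted_credits: credits from each downstream neighbor l
    in_transit:       credits occupied by packets in transit on link (i, l)
    queued:           credits due to packets still queued at node i
    """
    total = sum(w - in_transit.get(l, 0) for l, w in weighted_credits.items())
    return max(total - queued, 0)  # clamping at zero is an assumption
```

With weighted credits of 6 and 4 from two downstream neighbors, 2 and 1 credits in transit, and 3 credits queued, the sketch yields 4 available credits.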

The correctness of the credit-based mechanism (i.e., showing
that  it  has  no  deadlocks  and  that  packets  are  not  dropped)  can  be 
proven  in  a  similar  way  as  for  virtual-circuit  connections  [8].  For 
the  purposes  of  such  a  proof,  it  must  be  assumed  that  initialization 
of  the  protocol  is  done  properly,  that  there  are  no  link  errors,  and 
that  there  are  no  link  and  node  failures. 

3.3  Maintenance  of  Loop-Free  Multipaths 

The primary objective of maintaining multiple loop-free paths
is to minimize the end-to-end path delay by reducing network
congestion along the path. The distance reported by neighbor k to
node i for destination j is denoted by $D^i_{jk}$, and node i’s distance
to its neighbor k is denoted by $d_{ik}$. The distance to neighbor k is
the sum of the propagation delay $\delta_{ik}$ and the per-hop packet delay
through neighbor k, $d^i_{jk}$, i.e., $d_{ik} = d^i_{jk} + \delta_{ik}$. The path delay at
node i along node k at a given time t, denoted by $\hat{D}^i_{jk}$, is

$\hat{D}^i_{jk} = D^i_{jk} + d_{ik}$

The shortest multipath set of i for destination j ($SM^i_j$) consists of
those neighbors of i that provide loop-free paths to j. The delay
at node i to destination j at time t is computed as the weighted
average path delay through all the nodes in the shortest multipath
at node i; it is denoted by $D^i_j(t)$. This delay is weighted by the
fraction of the traffic going through that path, i.e.,

$D^i_j(t) = \sum_{k \in SM^i_j(t)} \phi^i_{jk}(t) \, \hat{D}^i_{jk}(t)$    (10)


The flow from i to each neighbor in $SM^i_j$ depends on the
credits available through that neighbor. Assuming that packets
are of fixed size and that each packet corresponds to one credit,
we can say that a packet flows on a link if at least one credit is
available on that link. This implies that the number of packets
that can flow on a link is equal to the number of credits available
through that link; therefore, with $\hat{D}^i_{jk}(t)$ denoting the path
delay through neighbor k,

$D^i_j(t) = \frac{1}{CR^i_j(t)} \sum_{k \in SM^i_j(t)} [WCR^i_{jk}(t) \cdot \hat{D}^i_{jk}(t)]$    (11)
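The credit-weighted average delay can be illustrated with the short sketch below. It normalizes by the sum of the weighted credits, which coincides with the total available credits only when no credits are in transit or queued; this normalization, the function name, and the treatment of the no-credit case are assumptions of the sketch.

```python
def weighted_path_delay(weighted_credits, path_delay):
    """Average path delay over the shortest multipath, weighted by credits.

    weighted_credits: credits available through each successor k
    path_delay:       path delay to the destination through each successor k
    Returns infinity when no credits are available (assumption).
    """
    total = sum(weighted_credits.values())
    if total == 0:
        return float("inf")
    return sum(w * path_delay[k] for k, w in weighted_credits.items()) / total
```

For two successors with 3 and 1 credits and path delays of 10 and 20, the weighted delay is (3 x 10 + 1 x 20) / 4.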

If the packets are of variable lengths, the packet length is a
multiple of credits and for simplicity we can assume that each
packet requires C credits on average. The total number of
packets transmitted along a link with $WCR^i_{jl}(t)$ credits is a constant
K times the total available credits; therefore, with $\hat{D}^i_{jl}(t)$
denoting the path delay through neighbor l,

$D^i_j(t) = \frac{1}{K \, CR^i_j(t)} \sum_{l \in SM^i_j(t)} K \, WCR^i_{jl}(t) \cdot \hat{D}^i_{jl}(t)$    (12)

Multiple loop-free paths from each node to a destination are
maintained by means of a shortest multipath routing algorithm
(SMRA), which is based on DUAL [5]. Any change in distance
is notified by event-driven update messages. An update message
from router i consists of a vector of entries; each entry specifies a
destination j, an update flag, a successor flag, the reported distance
to that destination and the reported credits available to destination
j through that neighbor. The update flag indicates whether the
entry is an update ($u^i_j = 0$), a query ($u^i_j = 1$) or a reply to a query
($u^i_j = 2$).
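The layout of an update message entry can be sketched as a small record type. The field and class names are hypothetical; only the five fields and the three flag values (0, 1, 2) come from the description above.

```python
from dataclasses import dataclass
from enum import IntEnum

class UpdateFlag(IntEnum):
    """Values of the update flag carried in each entry."""
    UPDATE = 0
    QUERY = 1
    REPLY = 2

@dataclass
class UpdateEntry:
    """One entry of an update message sent by router i."""
    destination: str
    update_flag: UpdateFlag
    successor_flag: int   # 1 if the receiver is in the sender's shortest multipath set
    distance: float       # reported distance to the destination
    credits: int          # reported credits available to the destination

# An update message is a vector (here, a list) of such entries.
message = [
    UpdateEntry("j", UpdateFlag.UPDATE, 1, 4.0, 5),
    UpdateEntry("m", UpdateFlag.QUERY, 0, 9.0, 0),
]
```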

A  detailed  specification  of  SMRA  is  given  in  [7].  A  router  i 
can  be  active  or  passive  for  destination  j  at  any  given  time.  Node  i 
is  active  for  destination  j  if  it  is  waiting  for  at  least  one  reply  from 
a  neighbor,  and  is  passive  otherwise.  A  router  i  initializes  itself  in 
passive  state  with  an  infinite  distance  to  all  its  known  neighbors 
and a zero distance to itself. The maximum allowable distance
to reach neighbor, defined below, is also set to $\infty$. Routers send
updates  containing  distance  and  credit  information  for  themselves 
to  all  their  neighbors.  When  the  destinations  become  operational, 
routers  inform  their  neighbors  about  the  available  credits  to  all 
other  nodes. 

Credit  information  is  updated  periodically  while  the  distance 
information  is  exchanged  among  neighbors  when  the  state  of 
the  network  changes.  Each  routing  update  updates  the  cost  and 
the  credit  information.  An  update  can  be  a  full  routing  table  or 
increments  of  the  routing  table  in  different  update  messages.  After 
initialization,  only  incremental  updates  are  sent. 

For a given destination, a router updates its routing table
differently depending on whether it is passive or active for that
destination. A router that is passive for a given destination can
update the routing-table entry for that destination independently
of any other routers: it simply chooses as its new distance the
shortest distance to that destination among all neighbors, and as
its new feasible successor any neighbor through which that shortest
distance is achieved. In contrast, a router that is or becomes active
for a given destination must synchronize the updating of its
routing-table entry with other routers.

When  a  router  is  passive  and  needs  to  update  its  routing  table 
for  a  given  destination  j  after  it  processes  an  update  message  from 
a  neighbor  or  detects  a  change  in  the  cost  or  availability  of  a  link 
or  a  change  in  the  credit  information,  it  tries  to  obtain  a  feasible 
successor.  From  router  i’s  standpoint,  a  feasible  successor  toward 
destination  j  is  a  neighbor  router  k  that  satisfies  the  maximum 
allowable  distance  condition  (MADC)  given  by  the  following  two 
equations  [5]: 

$$D_j^i = D_{jk}^i + d_{ik} = \min\{ D_{jp}^i + d_{ip} \mid p \in N_i \}$$

$$D_{jk}^i < MAD_j^i \qquad (13)$$

where $MAD_j^i$ is the maximum allowable distance for destination
j, and is equal to the minimum value obtained for $D_j^i$ since
the last time router i transitioned from active to passive state
for destination j. Router i adjusts $MAD_j^i$ depending on the
congestion level of the network.
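
A minimal sketch of the feasibility test follows. It assumes that Eq. 13 is read as two conditions: neighbor k attains the minimum of reported distance plus link cost, and the distance reported by k is strictly below the maximum allowable distance. The function name `feasible_successor` and the dictionary-based inputs are illustrative choices, not part of the protocol specification.

```python
import math
from typing import Dict, Optional


def feasible_successor(
    reported: Dict[str, float],   # distance to j reported by each neighbor
    link_cost: Dict[str, float],  # cost of the link to each neighbor
    mad: float,                   # maximum allowable distance for j
) -> Optional[str]:
    """Return a neighbor satisfying the assumed reading of MADC, or None."""
    totals = {p: reported.get(p, math.inf) + c for p, c in link_cost.items()}
    best = min(totals.values(), default=math.inf)
    if math.isinf(best):
        return None  # destination unreachable through every neighbor
    for k in sorted(totals):  # sorted only to make tie-breaking deterministic
        if totals[k] == best and reported.get(k, math.inf) < mad:
            return k
    return None


reported = {"a": 2.0, "b": 4.0}
link_cost = {"a": 3.0, "b": 1.0}
```

With these inputs both neighbors give a total distance of 5. With `mad=3.0`, neighbor `a` qualifies because it reports a distance of 2. With `mad=2.0`, no neighbor qualifies, and the router would have to start a diffusing computation.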

If router i finds a feasible successor, it remains passive and
updates its routing-table entry as in the Distributed Bellman-Ford
algorithm [2]. Alternatively, if router i cannot find a feasible
successor, it first sets its distance equal to the sum of the
distance reported by its current successor and the cost of the
link to that neighbor. The router also sets its maximum allowable
distance equal to its new distance. After performing these updates,
the router becomes active by sending a query in an update message
to all its neighbors; such a query specifies the router's new distance
through its current successor. It then sets the destination's
reply-status table entry for each link to one, indicating that it
expects a reply from each neighbor for that destination.
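
The passive-state behavior described above can be sketched as a single event handler. This is a simplified illustration under stated assumptions: `DestState` and `on_passive_event` are hypothetical names, message transmission is reduced to a returned string, and the feasibility test is the same assumed reading of MADC (minimum total distance, reported distance strictly below MAD).

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DestState:
    """Per-destination routing state at one router (hypothetical layout)."""
    distance: float
    mad: float
    successor: Optional[str]
    active: bool = False
    reply_status: Dict[str, int] = field(default_factory=dict)


def on_passive_event(state: DestState,
                     reported: Dict[str, float],
                     link_cost: Dict[str, float]) -> str:
    """Handle an input event while passive; return the message type to send."""
    totals = {k: reported.get(k, math.inf) + c for k, c in link_cost.items()}
    best = min(totals.values(), default=math.inf)
    feasible = [k for k in sorted(totals)
                if totals[k] == best and reported.get(k, math.inf) < state.mad]
    if feasible and best < math.inf:
        # Feasible successor found: stay passive, Bellman-Ford style update.
        state.successor = feasible[0]
        state.distance = best
        state.mad = min(state.mad, best)  # MAD tracks the minimum distance seen
        return "update"
    # No feasible successor: keep the current successor and become active.
    s = state.successor
    if s is None:
        state.distance = math.inf
    else:
        state.distance = reported.get(s, math.inf) + link_cost.get(s, math.inf)
    state.mad = state.distance
    state.active = True
    state.reply_status = {k: 1 for k in link_cost}  # expect one reply per link
    return "query"
```

For example, a router with distance 5 through `a` whose link to `a` rises to cost 10 stays passive if neighbor `b` reports a distance below 5, and otherwise sends a query with its new distance of 12 through `a`.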

Once active for destination j, router i cannot change its
feasible successor, $MAD_j^i$, the value of the distance it reports to
its neighbors, or its entry in the routing table, until it receives all
the replies to its query. A reply received from a neighbor indicates
that such a neighbor has processed the query and has either
obtained a feasible successor to the destination, or determined that it
cannot reach the destination. Once node i obtains all the replies to
its  query,  it  computes  a  new  distance  and  successor  to  destination 
j,  updates  its  feasible  distance  to  equal  its  new  distance,  and  sends 
an  update  to  all  its  neighbors. 

Multiple  changes  in  link  cost  or  availability  are  handled  by 
ensuring  that  a  given  node  is  waiting  to  complete  the  processing 
of  at  most  one  query  at  any  given  time.  The  mechanism  used  to 
accomplish  this  is  specified  in  [5],  and  is  such  that  a  node  can  be 
either  passive  or  in  one  of  four  active  states,  and  it  processes  any 
pending  update  or  distance  increases  that  occurred  while  it  was 
active. The state of node i for destination j is denoted by the flag $o_j^i$.

Ensuring  that  updates  stop  being  sent  in  the  network  when 
some destination is unreachable is easily done. If node i has set
$D_j^i = \infty$ already and receives an input event (a change in cost or
status of link (i, k), or an update or query from node k) such that
$D_{jk}^i + d_{ik} = \infty$, then node i simply updates $D_{jk}^i$ or $d_{ik}$, and sends
a reply to node k with $RD_j^i = \infty$ if the input event is a query
from node k. When an active node i has an infinite maximum
allowable  distance  and  receives  all  the  replies  to  its  query  such 
that  every  neighbor  offers  an  infinite  distance  to  the  destination, 
the  node  simply  becomes  passive  with  an  infinite  distance. 

When node i establishes a link with a neighbor k, it updates
the value of $d_{ik}$ and assumes that node k has reported infinite
distances to all destinations and has replied to any query for which
node i is active. Furthermore, if node k is a previously unknown
destination, node i sets $o_k^i = 1$, $s_k^i = \text{null}$, and
$D_k^i = RD_k^i = MAD_k^i = \infty$. Node i also sends to its new neighbor k
an update for each destination for which it has a finite distance.

When node i is passive and detects that link (i, k) has failed,
it sets $d_{ik} = \infty$ and $D_{jk}^i = \infty$. After that, node i carries out
the  same  steps  used  for  the  reception  of  a  link-cost  change  in  the 
passive  state. 

Because  a  router  can  become  active  in  only  one  diffusing 
computation  per  destination  at  a  time,  it  can  expect  at  most  one 
reply  from  each  neighbor.  Accordingly,  when  an  active  node  i 
loses connectivity with a neighbor n, node i can set $r_{jn}^i = 0$ and
$D_{jn}^i = \infty$, i.e., assume that its neighbor n has sent any required
reply reporting an infinite distance. If node n is $s_j^i$, node i also
sets $o_j^i = 0$. When node i becomes passive again and $o_j^i = 0$,
it cannot simply choose a shortest distance; rather, it must find a
neighbor that satisfies the MADC using the value of $MAD_j^i$ set
at the time node i became active in the first place. After finding
a new successor, the permit bucket parameters $\rho_j^i$ and $\sigma_j^i$ are also
updated.

Figure 5 gives a graphical representation of how MAD is
updated. The point at which a new diffusing computation starts is a
synchronization point. Between two synchronization points, the
value of MAD can only decrease or remain the same.

To route packets to a destination j, each router uses the
following rule to select the neighbor routers that should belong to its
shortest multipaths for j:

Shortest Multipath Condition (SMC): At time t, router i can make
node $k \in N_i(t)$ part of $SM_j^i(t)$ if and only if $D_{jk}^i(t) < MAD_j^i(t)$.
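
As a small illustration, the shortest multipath set can be computed by filtering the neighbors on the distance they report. The sketch assumes that SMC compares each neighbor's reported distance to j against the router's maximum allowable distance with a strict inequality; the function name `shortest_multipath_set` is hypothetical.

```python
from typing import Dict, Set


def shortest_multipath_set(reported: Dict[str, float], mad: float) -> Set[str]:
    """Neighbors whose reported distance to j is strictly below MAD."""
    return {k for k, d in reported.items() if d < mad}


# Distances to destination j reported by three neighbors of router i.
reported = {"a": 2.0, "b": 4.0, "c": 5.0}
members = shortest_multipath_set(reported, mad=5.0)
```

Here neighbors `a` and `b` qualify, while `c` is excluded because its reported distance equals the maximum allowable distance rather than falling below it.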


Fig. 5. Maximum Allowable Distance Condition


When  nodes  choose  their  successors  using  SMC,  the  path  from 
source  to  destination  obtained  as  a  result  of  this  is  loop  free  at  every 
instant.  The  proof  of  correctness  and  loop-freedom  of  SMRA  is 
basically  the  same  as  that  provided  in  [5]  for  DUAL. 

4.  Worst-Case  Steady-State  Delays 

In this section, we derive an upper bound on the end-to-end
steady-state path delay from node i to destination j ($D_j^{i*}$) as a
function of the credits available through each path under steady
state. Steady state means that all distance and credit information
is correct at every router. This bound demonstrates that it is
possible to provide performance guarantees in a connectionless routing
architecture. The delay experienced by a packet accepted into the
network is the time required by a data packet to reach its
destination router from a source. This includes both the propagation
delay and the queueing delay. Path delay can also be interpreted
as the time it would take for a destination j backlog to clear when
there are no more arrivals after time t.

Parekh  and  Gallager  have  analyzed  worst-case  session  delay 
in a connection-oriented network architecture [10]. We adopt a
similar approach for each destination in a connectionless
architecture. To do this, we assume a stable topology in which all routers
have  finite  distances  to  each  other.  We  also  make  use  of  the  fact 
that  SMRA  enforces  loop-freedom  at  every  instant  on  all  paths  in 
the  shortest  multipath  sets. 

In a connectionless network where routes are computed in a
distributed manner, the path taken by a packet can change dynamically
depending on the congestion level in the network. Routing is
done on a hop-by-hop basis, independently at each router.
Therefore, the total traffic at a node is the sum of the traffic on all its
links connecting to upstream neighbors. To obtain an expression
for the worst-case bound, we make the following assumptions:

1. Each node sends traffic to destination j as long as credits
are available (non-zero) for that destination along any of its
chosen paths.

2. At every node m, traffic for every destination is treated
independently.

3. Traffic arriving at a node i for destination j in the interval
(0, t) (denoted by $A_j^i(t)$) is the sum of the traffic from all its
upstream neighbors to destination j and the traffic originated
at the node i itself, denoted by $r_j^i(t)$, i.e.,

(14) 

=  (15) 

l\ieSM‘  (t) 


Each router in a connectionless network can itself be a source
to any given destination. At each node, traffic to destination j is
constrained by a permit bucket filter. The worst-case delay and
backlog are upper bounded by an additive scheme due to Cruz [3].
The rate at which the packets are serviced at each node depends
on the permit bucket (or leaky bucket) parameters $\sigma_j^i$ and $\rho_j^i$ for
a given destination j. The parameter $\sigma_j^i$ gives the permit bucket
size and $\rho_j^i$ the credit generation rate at node i. Therefore, the
number of packets that are being serviced at a node is a function
of $\sigma_j^i$ and $\rho_j^i$.
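
To make the roles of the two parameters concrete, the following sketch implements a generic permit (token) bucket for one destination. It is a textbook-style regulator written under the assumption that permits accrue continuously at the generation rate and are capped at the bucket size; the class name `PermitBucket` and its methods are hypothetical and are not taken from the protocol specification.

```python
class PermitBucket:
    """Per-destination permit bucket with size sigma and generation rate rho."""

    def __init__(self, sigma: float, rho: float) -> None:
        self.sigma = sigma        # bucket size (maximum stored permits)
        self.rho = rho            # permits generated per unit time
        self.permits = sigma      # assume the bucket starts full
        self.last = 0.0           # time of the last refill

    def _refill(self, now: float) -> None:
        """Add the permits generated since the last call, capped at sigma."""
        self.permits = min(self.sigma, self.permits + self.rho * (now - self.last))
        self.last = now

    def admit(self, now: float, packets: float = 1.0) -> bool:
        """Admit `packets` at time `now` if enough permits are available."""
        self._refill(now)
        if self.permits >= packets:
            self.permits -= packets
            return True
        return False


bucket = PermitBucket(sigma=2.0, rho=1.0)
burst = [bucket.admit(0.0) for _ in range(3)]   # third packet exceeds the burst
later = bucket.admit(1.0)                        # one permit regenerated by t = 1
```

Under this model the traffic admitted in any interval of length T is at most the bucket size plus the generation rate times T, which is the kind of constraint the delay analysis relies on.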

The minimum service rate at any node i is a fraction of
the input traffic at node i for destination j. That fraction of the
traffic is determined by the ratio of the routing variables of the
links, which is a function of the traffic flow; therefore,


$$S_j^i(\tau, t) \ge (t - \tau)\,\rho_j^i(t - \tau) \qquad (20)$$

4.1  Negligible  Packet  Size 

We first obtain a bound on end-to-end path delay assuming
that packet size does not contribute significantly to the
delay. The arrivals at each node i are the sum of the
arrivals at all the upstream nodes for destination j and the traffic
originated at node i itself. For all $t \ge \tau \ge 0$ we have,

(t  f)  =  »■]  (t  f)  +  ^  (21) 

ileGSM^  (t) 
The minimum clearing rate of a given path is
$g_j^i = \min_{m \in P(i,j)} g_j^m$. When $g_j^i > \rho_j^i$, the system with respect to
destination j is said to be locally stable. The input traffic rate
at node i to destination j is the sum of all the incoming traffic
destined for j for which i is an intermediate node and the traffic
originated at i itself (Eq. 5). With these constraints, the bound on
the delay for a given destination can be obtained using an
approach similar to that in [10].

The delay on a link (i, k) (per-hop delay) $d_{ik}$ for a given
destination j is the sum of the queueing delay and the propagation
delay on that link. The link propagation delay ($\delta_k^i$) depends
on the congestion level of the link as well as the link capacity.
Propagation delay is defined as the time taken for a packet to
reach a destination from a source. Every packet is time-stamped
when it leaves a node and the time at which the packet reaches
the neighbor is noted. The difference between the two gives a
one-hop delay. The average of this delay over a given period of
time gives the propagation delay $\delta_k^i$.

The  queueing  delay  is  the  time  a  packet  has  to  wait  at  a  node 
before  it  is  processed.  The  waiting  time  of  a  packet  depends  on 
the  number  of  packets  already  present  in  the  queue  at  the  time 
a  packet  arrives.  This  is  referred  to  as  the  backlog  at  node  i  for 
destination j and is denoted by $Q_j^i$. Therefore, the delay on link
(i, l) for destination j at time t is


$$d_{il}(t) = \delta_l^i(t) + Q_j^i(t)\,\delta_l^i(t) = \delta_l^i(t)\,[1 + Q_j^i(t)] \qquad (17)$$


The maximum backlog traffic $Q_j^{i*}$ for destination j is the
difference between the arrivals in the interval $(\tau, t]$ and the total
packets serviced in the same interval at node i. For j > 0,


$$Q_j^{i*}(\tau, t) \le A_j^i(\tau, t) - S_j^i(\tau, t) \qquad (22)$$

Q7  ("!■,  f)  <  r]{T,  t)  -  S‘  (r,  t) 

+  (23) 

i|i6SMj(t) 

Qrir,t)  < 

+  X/  (24) 

The difference $(r_j^i(t) - r_j^i(\tau))$ determines the amount of traffic
arriving at node i in the interval $(t - \tau)$; its maximum is
the sum of the tokens available at node i and the tokens received in
the interval $(t - \tau)$. At every node, each destination is constrained
independently  by  a  permit  bucket  scheme.  Following  Parekh 
and  Gallager’s  approximation  [10],  we  assume  the  links  to  be 
of  infinite  capacity.  The  results  for  the  infinite  capacity  case 
upper-bound  the  finite  capacity  case.  In  other  words,  the  results 
of  infinite  capacity  can  be  used  for  any  finite  speed  link.  The 
arrival  and  the  service  functions  at  each  router  can  be  translated  to 
permit  bucket  parameters  which  in  turn  depend  on  the  maximum 
tolerable  path  delay  and  the  link  flows.  Substituting  for  the  arrivals 
and  the  number  of  packets  serviced  in  terms  of  the  permit  bucket 
parameters  from  the  previous  section  we  have 


The backlog (number of packets) for a given destination j at a
given time t can be defined as the difference between the incoming and
the outgoing traffic at a node, i.e.,

$$Q_j^i(t) = A_j^i(t) - S_j^i(t) \qquad (18)$$
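
The backlog definition can be illustrated with a small discrete-time sketch. It assumes time is divided into slots, with per-slot arrival counts and a per-slot service capacity, and that a node cannot serve more packets than it has queued; the function name `backlog` and this slotted model are assumptions made for illustration only.

```python
from typing import List, Sequence


def backlog(arrivals: Sequence[int], capacity: Sequence[int]) -> List[int]:
    """Backlog after each slot: cumulative arrivals minus cumulative service."""
    queue: List[int] = []
    arrived = 0   # cumulative arrivals, A(t)
    served = 0    # cumulative service, S(t)
    for a, c in zip(arrivals, capacity):
        arrived += a
        served += min(c, arrived - served)  # cannot serve more than is queued
        queue.append(arrived - served)      # Q(t) = A(t) - S(t)
    return queue


q = backlog(arrivals=[3, 0, 2, 0], capacity=[1, 1, 1, 1])
```

With a burst of three packets followed by two more and a service capacity of one packet per slot, the backlog evolves as 2, 1, 2, 1, and its maximum over the interval plays the role of the worst-case backlog in the analysis.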

This takes into account both the processing delay and the
queueing delay experienced at each hop. For every interval $(\tau, t]$,

$$S_j^i(\tau, t) \ge (t - \tau)\,g_j^i \qquad (19)$$

If the minimum clearing rate $g_j^i$ is greater than the token
generation rate $\rho_j^i$ for a given destination, we can obtain a bound
on the backlog and hence the path delay. Let $\tau < t$ be the
time at which there are no backlogged packets in the network.
Then, because $g_j^i > \rho_j^i$ and all the destinations are permit bucket
constrained,


+  ^  ~ 
l\ieSM‘(t)  ^ 

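Because $\phi_{jl}^i(t) \le 1$ for any j and $l \in SM_j^i(t)$,

$$Q_j^{i*}(\tau, t) \le \sigma_j^i(t - \tau) + \sum_{l \mid i \in SM_j^l(t)} [\sigma_j^l(t - \tau) + \rho_j^l(t - \tau)] \qquad (25)$$

Making $\tau = t - \Delta t$, we can write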

I  I  ieSM‘  (t) 


Therefore, the backlog at node i to destination j depends on
the leaky bucket parameters at node i and the permit bucket
parameters of all the upstream neighbors of i for which node i is in
the shortest multipath set.

The delay at each node i can be computed as the weighted
average path delay through all its multipath neighbors; therefore,

$$D_j^i(t) = \sum_{k \in SM_j^i(t)} \phi_{jk}^i(t)\,[D_{jk}^i(t) + d_{ik}(t)] \qquad (26)$$

The  distance  from  i  to  j  through  neighbor  k  can  be  expressed 
as  the  sum  of  the  distance  from  k  to  j  and  the  link  cost  from  i  to 
k.  The  link  cost  is  the  sum  of  the  distance  and  the  propagation 
delay  of  that  link.  Therefore, 

J  kit) +  dik{t)  =  D‘^^{t)  +  [d^f^{t)  + Slit)] 

=  I]  ^]k(t)[Dtk(t)  +  (dUt)  +  dl)(t)]  (27) 

keSM^{t) 

From Eq. 17, $d_{ik}(t) = \delta_k^i(t)\,[1 + Q_j^i(t)]$, which implies that

$$D_j^i(t) = \sum_{k \in SM_j^i(t)} \phi_{jk}^i(t)\,\{D_{jk}^i(t) + \delta_k^i(t)\,[1 + Q_j^i(t)]\} \qquad (28)$$

Because SMC must be satisfied by every $k \in SM_j^i(t)$,
$D_{jk}^i(t) < MAD_j^i(t)$. Then, if $D_j^{i*}(t)$ is the maximum path
delay from i to j at time t and $Q_j^{i*}(t)$ is the maximum backlog
from i to j at time t, we obtain from Eq. 28 that

$$D_j^{i*}(t) \le \sum_{k \in SM_j^i(t)} \phi_{jk}^i(t)\,\delta_k^i(t)\,[1 + Q_j^{i*}(t)] + MAD_j^i(t) \sum_{k \in SM_j^i(t)} \phi_{jk}^i(t) \qquad (29)$$

Let the maximum link propagation delay of all the links from
i to a node in $SM_j^i(t)$ be

$$\Delta_j^i(t) = \max_{k \in SM_j^i(t)} \delta_k^i(t) \qquad (30)$$

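Therefore, the maximum path delay from i to j becomes

$$D_j^{i*}(t) \le \Delta_j^i(t) \sum_{k \in SM_j^i(t)} \phi_{jk}^i(t)\,[1 + Q_j^{i*}(t)] + MAD_j^i(t) \sum_{k \in SM_j^i(t)} \phi_{jk}^i(t) \qquad (31)$$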

Noticing that $Q_j^{i*}(t)$ is independent of k and substituting Eq. 7 in
Eq. 31, we obtain

$$D_j^{i*}(t) \le \Delta_j^i(t)\,[1 + Q_j^{i*}(t)] + MAD_j^i(t) \qquad (32)$$

The above equation is the upper bound on $D_j^i(t)$ that should
be expected. It states that $D_j^i(t)$ must be smaller than the sum of
the product of the backlog for j at node i times the maximum link
propagation delay in node i's shortest multipath, plus $MAD_j^i(t)$.
The first term of Eq. 32 corresponds to the delay incurred by
sending all backlogged packets at time t to a neighbor with the
longest link propagation delay. The second term corresponds
to the maximum delay incurred by any neighbor receiving the
backlog packets; because any such neighbor must be in $SM_j^i(t)$,
that delay can be at most equal to $MAD_j^i(t)$.
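
The following sketch evaluates the bound numerically. It assumes Eq. 32 is read as the maximum link propagation delay times one plus the maximum backlog, plus the maximum allowable distance, and that Eq. 29 is the same expression with the per-neighbor delays weighted by routing variables that sum to one. The function names and the sample numbers are hypothetical.

```python
from typing import Dict


def weighted_bound(phi: Dict[str, float], delta: Dict[str, float],
                   max_backlog: float, mad: float) -> float:
    """Assumed form of Eq. 29: routing-variable-weighted delay bound."""
    link_term = sum(phi[k] * delta[k] * (1 + max_backlog) for k in phi)
    return link_term + mad * sum(phi.values())


def path_delay_bound(delta: Dict[str, float],
                     max_backlog: float, mad: float) -> float:
    """Assumed form of Eq. 32: worst link delay times (1 + backlog), plus MAD."""
    worst_link = max(delta.values())  # maximum propagation delay over the set
    return worst_link * (1 + max_backlog) + mad


delta = {"a": 2.0, "b": 3.0}   # link propagation delays to the two successors
phi = {"a": 0.5, "b": 0.5}     # routing variables, summing to one
loose = path_delay_bound(delta, max_backlog=4.0, mad=10.0)
tight = weighted_bound(phi, delta, max_backlog=4.0, mad=10.0)
```

With these numbers the weighted expression gives 22.5 and the simplified bound gives 25.0, which illustrates that replacing each link delay by the maximum can only loosen the bound.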

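Substituting Eq. 25 in Eq. 32, we can represent the same bound
in terms of permit bucket parameters as follows:

$$D_j^{i*}(t) \le \Delta_j^i(t)\,\Big\{1 + \sigma_j^i(t - \tau) + \sum_{l \mid i \in SM_j^l(t)} [\sigma_j^l(t - \tau) + \rho_j^l(t - \tau)]\Big\} + MAD_j^i(t) \qquad (33)$$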

The  bound  given  by  Eqns.  32  and  33  for  router  i  is  based  on 
a  maximum  delay  offered  by  the  neighbor  of  i  and  a  maximum 
backlog  allowed  at  router  i.  This  is  possible  because  of  two  main 
features of SMRA: datagrams are accepted only if routers have
enough credits to ensure their delivery, and datagrams are
delivered along loop-free paths. In contrast, in traditional datagram
routing  architectures,  any  datagram  presented  to  a  router  is  sent 
towards  the  destination,  and  the  paths  taken  by  such  datagrams 
can  have  loops;  therefore,  it  is  not  possible  to  ensure  a  finite  delay 
for  the  entry  router  or  any  relay  router  servicing  a  datagram. 

4.2  Non-negligible  Packet  Size 

In PGPS networks, routing nodes do not transmit a packet until
it has completely arrived. Therefore, the number of packets
that reach a downstream node is at most equal to the
number of packets serviced by its upstream neighbors. Let $L_i$ be
the maximum packet size at node i. The PGPS server does not
begin servicing a packet until its last bit has arrived.

For a packet-switched network,

$$A_j^i(\tau, t) = r_j^i(\tau, t) + \sum_{m \mid i \in SM_j^m(t)} S_j^m(\tau, t) \qquad (34)$$

Here, $S_j^m(\tau, t)$ represents the number of packets serviced by
an upstream neighbor m for which i is in the shortest multipath to
j in the interval $(t - \tau)$. Let K be the number of hops in a given
path from i to j, and let m and m - 1 be two successive nodes. Then,
for a given path,

Y^Tm~\Lt)  >  ATir,t)-rT{r,t) 

m—l 

>  (35) 

m—l 

where $m = 2, \ldots, K$, $\tau \le t$, and $L_{m-1}$ is the maximum length
of a packet transmitted by node m - 1. Here, the nodes m and
(m - 1) are such that $m \in SM_j^{m-1}(t)$.

For a PGPS system, the number of packets serviced for $t \ge \tau$
is given as

$$S_j^i(\tau, t) \ge \min_{V} \{[A_j^i(\tau, V) - r_j^i(\tau, V)] + G_j^i(t - V)\} + K \cdot L_i \qquad (36)$$

where V represents the last time in the interval $[\tau, t]$ at which
node i begins a busy period for destination j, and the function $G_j^i$
is a convex function which indicates the amount of service given
to destination j under a greedy regime.


5.  Conclusion 


$$S_j^i(0, t) \ge \min_{V} \{[A_j^i(0, V) - r_j^i(0, V)] + G_j^i(t - V)\} + K \cdot L_i \qquad (37)$$

With a greedy regime, the service to destination j is minimized
and is delayed by an appropriate amount, which is given by the
minimizing value of V, denoted by $V_{min}$.

$$S_j^i(t) \ge \{[A_j^i(V_{min}) - r_j^i(V_{min})] + G_j^i(t - V_{min})\} + K \cdot L_i \qquad (38)$$

The backlog traffic for a destination j from i is the difference
between the number of packets that have arrived and the number of
packets serviced, as in the previous case.

$$Q_j^i(0, t) = A_j^i(0, t) - S_j^i(0, t) \qquad (39)$$

Applying an argument similar to that of the previous section, we have

i\sM^{t)ei 

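Substituting for $S_j^i(t)$, we have

$$Q_j^i(t) = \rho_j^i(t) - \rho_j^i(V_{min}) - G_j^i(t - V_{min}) + m \cdot L_i + \sum_{l \mid i \in SM_j^l(t)} [\sigma_j^l(t) + \rho_j^l(t)] \qquad (41)$$

Thus, the maximum backlog is given by

$$Q_j^{i*}(t) = \rho_j^i(t) - \rho_j^i(V_{min}) + m \cdot L_{max} - G_j^i(t - V_{min}) + \sum_{l \mid i \in SM_j^l(t)} [\sigma_j^l(t) + \rho_j^l(t)] \qquad (42)$$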

Having bounded the worst-case backlog, we can use an
approach similar to that of Eq. 31 to obtain bounds for the maximum
path delay. Since we are considering a PGPS system, the expression
for maximum path delay becomes

MAD‘(t)  +  A^^(t)  Y,  [l  +  <?r(0]  (43) 

l\ieSM^^  (t) 

Substituting  for  the  maximum  backlog  from  Eq.  42  we  obtain, 

D^/(t)  <  Y  {l  +  pj(i)  - 

i|eG5Mj(t) 

-\-7Tl.L/jrtax  ~  {t  —  Vrnin) 

+I2hw+p‘w]>  (44) 

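$$D_j^{i*}(t) \le MAD_j^i(t) + \Delta_j^i(t)\,\Big\{1 + \rho_j^i(t) - \rho_j^i(V_{min}) + m \cdot L_{max} - G_j^i(t - V_{min}) + \sum_{l \mid i \in SM_j^l(t)} [\sigma_j^l(t) + \rho_j^l(t)]\Big\} \qquad (45)$$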

Here  again,  the  excess  delay  experienced  by  a  packet  depends 
on  the  network  traffic  as  earlier.  In  addition,  it  also  is  a  function 
of  the  packet  size  and  depends  on  the  entire  path  from  source  to  a 
given  destination  node. 


We have presented a new framework for the modeling of
multipath routing in connectionless networks that dynamically adapt
to network congestion. We have demonstrated that it is
possible to provide performance guarantees for the delivery of packets
in such networks. The basic routing protocol uses a short-term
metric based on hop-by-hop credits to reduce congestion and a
long-term metric based on end-to-end path delays from a source
to a destination. Packet forwarding is done on a hop-by-hop
basis. Each node is modeled as a PGPS server which contains
destination-based permit buckets. A loop-free multipath routing
protocol has been proposed to regulate the traffic on each link
by monitoring the parameters of the destination-oriented permit
buckets. A worst-case delay bound under steady state has been
derived for the above network model for both negligible and
non-negligible packet sizes.

Our  work  continues  to  study  the  dynamic  behavior  of 
congestion-oriented  shortest  multipath  routing,  and  to  define  how 
destination-oriented  routing  mechanisms  can  be  used  to  satisfy 
performance  requirements  specified  by  the  sources  of  packets. 